Optimal Clustering Scheme For Repeated Bisection Partitional Algorithm

نویسندگان

  • Kalyani Desikan
  • Hannah Grace
چکیده

Text clustering divides a set of texts into clusters such that texts within each cluster are similar in content. It may be used to uncover the structure and content of unknown text sets as well as to give new perspectives on familiar ones. The focus of this paper is to experimentally evaluate the quality of clusters obtained using partitional clustering algorithms that employ different clustering schemes. The optimal clustering scheme that gives clusters of better quality is identified for three standard data sets. Also, the ideal clustering scheme that optimizes the I2 criterion function is experimentally identified.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Evaluation of Partitional Algorithms for Clustering Medical Documents

There are large quantities of information about patients and their medical conditions. The discovery of trends and patterns hidden within the data could significantly enhance understanding of disease and medicine progression and management by evaluating stored medical documents. Methods are needed to facilitate discovering the trends and patterns within such large quantities of medical document...

متن کامل

خوشه‌بندی خودکار داده‌ها با بهره‌گیری از الگوریتم رقابت استعماری بهبودیافته

Imperialist Competitive Algorithm (ICA) is considered as a prime meta-heuristic algorithm to find the general optimal solution in optimization problems. This paper presents a use of ICA for automatic clustering of huge unlabeled data sets. By using proper structure for each of the chromosomes and the ICA, at run time, the suggested method (ACICA) finds the optimum number of clusters while optim...

متن کامل

Repeated Record Ordering for Constrained Size Clustering

One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into current clustering algorithms. One of the well researched constrained clustering algorithms is called microaggregation. In a microaggreg...

متن کامل

Genetic Algorithms in Partitional Clustering: A Comparison

Three approaches to partitional clustering using genetic algorithms (GA) are compared with k-means and the EM algorithm for three real world datasets (Iris, Glass and Vowel). The GA techniques differ in their encoding of the clustering problem using either a class id for each object (GAIE), medoids to assign objects to the class associated with the nearest medoid (GAME), or parameters for multi...

متن کامل

Computation of Initial Modes for K-modes Clustering Algorithm Using Evidence Accumulation

Clustering accuracy of partitional clustering algorithm for categorical data depends primarily on the choice of initial data points to instigate the clustering process and hence the clustering results cannot be generated and repeated consistently. In this paper we present an approach to compute initial modes for K-mode partitional clustering algorithm to cluster categorical data sets. Here we u...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013